Sketched SVD: Recovering Spectral Features from Compressive Measurements
Authors
Abstract
The singular value decomposition (SVD) of an N×n data matrix X, written X = UΣV^T, carries important information about the structure of the data set, especially when the rank k of X is small. In particular, the columns of U (known as the left singular vectors of X) span the principal directions of the data set and can be used as basis vectors for building up typical signals, and the diagonal entries of Σ (known as the singular values of X) reflect the energy of the data set in each of these directions. The extraction of these features is commonly known as Principal Component Analysis (PCA), a fundamental and widely used tool in data analysis and compression. The same process can be viewed through a slightly different lens when one imagines the columns of X as independent realizations of a length-N random vector x: computing the left singular vectors of X is equivalent to computing the eigenvectors of XX^T, which (up to rescaling) is the N×N sample covariance matrix of the data. In this context, PCA is also known as the Karhunen-Loève (KL) Transform.

There are, in fact, a number of applications where the right singular vectors V of a data matrix X are more important, or equivalently, where the eigenvectors of X^T X carry the structure of interest. For example, the product ΣV^T gives a low-dimensional embedding of the data set that preserves distances and angles between the n data vectors. This embedding can be used for clustering or categorizing the signals; for example, it is used for comparing documents in latent semantic analysis. The right singular vectors of X can also be viewed as the result of applying the KL Transform to the rows, rather than the columns, of X. In this sense, the columns of V describe the inter-signal (rather than intra-signal) statistical correlations. In cases where the column index corresponds to a distinct sensor position, a vertex in a graph, etc., this correlation structure can carry important structural information.
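As a concrete illustration of these two viewpoints, the following NumPy sketch (synthetic data; sizes chosen only for illustration, not taken from the paper) computes the SVD of a low-rank data matrix and checks that the embedding ΣV^T preserves pairwise distances between data vectors:

```python
import numpy as np

# Illustrative only: a small rank-k data matrix X (N x n), columns are signals.
rng = np.random.default_rng(0)
N, n, k = 100, 20, 3
X = rng.standard_normal((N, k)) @ rng.standard_normal((k, n))

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Columns of U span the principal directions (the PCA basis);
# Sigma @ V^T is a low-dimensional embedding of the n data vectors.
embedding = np.diag(s) @ Vt              # shape (min(N, n), n)

# Because U has orthonormal columns, the embedding preserves pairwise
# distances between the columns of X exactly (up to floating point).
i, j = 0, 1
d_orig = np.linalg.norm(X[:, i] - X[:, j])
d_emb = np.linalg.norm(embedding[:, i] - embedding[:, j])
print(abs(d_orig - d_emb) < 1e-10)  # True
```

The same `U`, `s`, and `Vt` cover both viewpoints: `U` gives the intra-signal PCA/KL basis, while the rows of `Vt` describe inter-signal correlations.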
Unfortunately, many of the applications in which we seek the right singular vectors of X are those in which the data is simply too large, too distributed, or generated too quickly for us to store it or to process it efficiently in one, centralized location. There are, however, settings in which the data sets, while large, have low intrinsic dimension or are of low rank. Let us suppose that the length of each data vector N is much larger than the number of observations n, and suppose that X has rank k ≤ n. The data may or may not be generated in a dynamic, streaming fashion, and it may or may not be collected in a distributed fashion amongst n sensors. We wish to design a joint observation process (which can be distributed amongst n sensors) that maintains a “sketch” of the data stream, and a reconstruction process that, at a central location, reconstructs not an approximation of the original data, but rather a good approximation to the singular values σ_j and the (right) singular vectors v_j of the original matrix X. The sketch of the data stream should be a linear, non-adaptive procedure, one that is efficient to update, and one that uses as few observations of the data matrix as possible so that as little communication as possible is required from the sensors to the central processing entity. Because the procedure is linear and non-adaptive, we can represent the sketch as a matrix-matrix product ΦX = Y, where the observation matrix Φ is of size m×N and the sketch Y is of size m×n. (The fact that the sketch is one-sided allows it to be computed sensor-by-sensor in distributed data collection settings.) We want m as small as possible. From the sketch matrix Y, we want to produce estimates σ′_j of the k non-zero singular values and estimates v′_j of the associated (right) singular vectors of X. We do this simply by computing the SVD of the sketch matrix Y, using standard (iterative) SVD algorithms.
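A minimal numerical sketch of this pipeline, with a Gaussian Φ and illustrative sizes (none of which are specified by the paper), might look like the following; it assembles Y = ΦX one column (one sensor) at a time and then reads spectral features off the SVD of Y:

```python
import numpy as np

# Hypothetical instance of the observation/reconstruction pipeline.
rng = np.random.default_rng(1)
N, n, k, m = 2000, 50, 4, 60
X = rng.standard_normal((N, k)) @ rng.standard_normal((k, n))  # rank-k data

# Gaussian observation matrix: a standard JL-type choice.
Phi = rng.standard_normal((m, N)) / np.sqrt(m)

# Each of the n sensors holds one column x_j and transmits only Phi @ x_j
# (m numbers rather than N), so the one-sided sketch Y = Phi X can be
# assembled column-by-column at the central location.
Y = np.empty((m, n))
for j in range(n):
    Y[:, j] = Phi @ X[:, j]

# Estimate singular values / right singular vectors from the sketch alone.
_, s_sketch, Vt_sketch = np.linalg.svd(Y, full_matrices=False)
_, s_true, Vt_true = np.linalg.svd(X, full_matrices=False)

# Since m >= k, Phi almost surely preserves the rank of X, so the top-k
# right singular vectors of Y span exactly the row space of X.
P = Vt_sketch[:k].T @ Vt_sketch[:k]          # projector onto that span
subspace_err = np.linalg.norm(Vt_true[:k] - Vt_true[:k] @ P)
print(subspace_err)  # numerically zero

# The sketched singular values match the true ones up to the distortion
# Phi introduces on the column span of X.
rel_err = np.max(np.abs(s_sketch[:k] - s_true[:k]) / s_true[:k])
print(rel_err)
```

Note that the `for` loop is only to emphasize the sensor-by-sensor structure; a single matrix product `Phi @ X` computes the same sketch.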
Our analysis is quite different from that of most randomized linear algebra methods, low-rank matrix approximations, robust PCA, rank-revealing QR factorizations, etc. We assume that the sketching matrix Φ is randomly generated and satisfies the distributional Johnson-Lindenstrauss (JL) property, so that with high probability it acts as a near isometry on the column span of X, and we then exploit relative-error (as opposed to absolute-error) perturbation analysis for deterministic (as opposed to random) matrices to obtain our results. If m = O(kε⁻²(log(1/ε) + log(1/δ))), where k denotes the rank of X, then with probability at least 1 − δ, the singular values σ′_j of Y approximate the corresponding non-zero singular values σ_j of X to within relative error ε.
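The near-isometry behavior behind this sample bound can be checked empirically. The toy experiment below (dimensions and trial counts are arbitrary choices, not the paper's) measures the worst observed norm distortion of a Gaussian Φ on a fixed k-dimensional subspace for several sketch sizes m; the distortion shrinks roughly like √(k/m):

```python
import numpy as np

# Empirical look at the JL near-isometry on a k-dimensional subspace.
rng = np.random.default_rng(2)
N, k = 5000, 5
basis = np.linalg.qr(rng.standard_normal((N, k)))[0]  # orthonormal basis

def max_distortion(m, trials=200):
    """Worst observed |  ||Phi x|| / ||x||  - 1 | over random x in the span."""
    Phi = rng.standard_normal((m, N)) / np.sqrt(m)
    worst = 0.0
    for _ in range(trials):
        x = basis @ rng.standard_normal(k)       # random vector in the span
        ratio = np.linalg.norm(Phi @ x) / np.linalg.norm(x)
        worst = max(worst, abs(ratio - 1.0))
    return worst

for m in (25, 100, 400):
    print(m, max_distortion(m))  # distortion roughly halves as m quadruples
```

This matches the role of m in the bound: driving the relative error ε down by a factor of two costs roughly four times as many measurement rows.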
Similar articles
Compressive hyperspectral imaging via adaptive sampling and dictionary learning
In this paper, we propose a new sampling strategy for hyperspectral signals that is based on dictionary learning and singular value decomposition (SVD). Specifically, we first learn a sparsifying dictionary from training spectral data using dictionary learning. We then perform an SVD on the dictionary and use the first few left singular vectors as the rows of the measurement matrix to obtain th...
Single image super resolution using compressive K-SVD and fusion of sparse approximation algorithms
Super Resolution based on Compressed Sensing (CS) considers a low resolution (LR) image patch as the compressive measurement of its corresponding high resolution (HR) patch. In this paper we propose a single image super resolution scheme with the compressive K-SVD algorithm (CKSVD) for dictionary learning, incorporating fusion of sparse approximation algorithms to produce better results. The CKSVD algo...
Compressive spectral embedding: sidestepping the SVD
Spectral embedding based on the Singular Value Decomposition (SVD) is a widely used “preprocessing” step in many learning tasks, typically leading to dimensionality reduction by projecting onto a number of dominant singular vectors and rescaling the coordinate axes (by a predefined function of the singular value). However, the number of such vectors required to capture problem structure grows w...
Spectral Estimation of Stationary Time Series: Recent Developments
Spectral analysis considers the problem of determining (the art of recovering) the spectral content (i.e., the distribution of power over frequency) of a stationary time series from a finite set of measurements, by means of either nonparametric or parametric techniques. This paper introduces the spectral analysis problem, motivates the definition of power spectral density functions, and reviews...
Multibiometric Template Security Using CS Theory – SVD Based Fragile Watermarking Technique
Protection of a biometric template against spoofing or modification attacks at the system database is a major issue in multibiometric systems. Hence, fragile digital watermarking is one of the solutions for biometric template protection against these attacks. In this paper, a fingerprint watermarking technique based on SVD and Compressive Sensing theory is proposed for protection of biometric templat...
Sparse Signal Recovery from Fixed Low-Rank Subspace via Compressive Measurement
This paper designs and evaluates a variant of the CoSaMP algorithm for recovering the sparse signal s from the compressive measurement v = Φ(Uw + s), given a fixed low-rank subspace spanned by U. Instead of first recovering the full vector and then separating the sparse part from the structured dense part, the proposed algorithm directly works on the compressive measurement to do the separation. We i...
Journal: CoRR
Volume: abs/1211.0361
Publication date: 2012